Liar’s Poker: a tricky problem, part 2

I recently wrote about a card game called Liar’s Poker and my endeavours to find an optimal strategy for the final round. Some extra progress has been made, but it’s still a long ways off. Despite what I wrote when I edited my previous post, I’m no longer convinced that we can adapt Simplex or a gradient descent algorithm to solve this. It turns out that although the gradient function is piecewise linear, there are an exponential number of different linear regions in it, and I’m not convinced we can do something like Simplex without enumerating them all. but here’s some other stuff Reid and I have done on the problem:

We’ve finalized Player 2’s strategy:

  1. If player 1 bid something naming 2 cards (either a pair or a high card with a kicker), call BS. If the bid isn’t there, you win. If it is there, any higher bid you consider will be challenged and lose, so it’s never a bad idea to challenge.
  2. If player 1 bids a high card (without a kicker) and it’s lower than your card, call BS, and you win.
  3. If player 1 bids a high card (without a kicker) and it’s higher than your bid, decide if he’s lying. If so, call BS. If not, call his card high with yours as a kicker.
  4. If player 1 bids your card as a high card, decide on the most likely scenario:
    • Player 1 has card i (where i is lower than your card), and bid your card high with i as a kicker (note that there are i different possibilities here; consider them all)
    • Player 1’s card is at least as high as your card, in which case you challenge the bid.

Note that due to part 1 of this strategy, player 1 should only make a 2-card bid when he has 1 of those cards: if he wants to bid a pair, bid a pair of his card; if he wants to bid a high card with a kicker, his card should be one or the other named.

We’ve also solved it if you have an 8-card deck: player 1 should always bid “ace high with a king kicker.” Player 2’s best bet is to always challenge the bid, but the bid will exist whenever the two cards are different (which occurs 4/7 of the time). Consequently, player 1 will win 4/7 of the time with this strategy. However, here’s a more rigorous proof: I’m going to use some of the same terminology as in the previous post, so P(W) is the probability of player 2 winning, etc. However, for brevity, I will use the following variables:

  • a = a_{K,AH} (chances of bidding “ace high” given a king)
  • b = a_{A,AH} (chances of bidding “ace high” given an ace)
  • c = a_{K,AK} (chances of bidding “ace high with a king kicker,” given a king)
  • d = a_{A,AK} (chances of bidding “ace high with a king kicker,” given an ace)

and instead of having free variables for bidding pairs, they will be simply whatever probability is left over (i.e. a_{K,PK} = 1-a-c and a_{A,PA} = 1-b-d). We can now write P(W) using the formula from the last post and simplify it into:

14*P(W) = max(3a, 4b) + max(4a, 3b) + 3d + 3c + 4(1-b-d) + 4(1-a-c)

We can look at the gradient of this over the different regions. In any region, the derivative of P(W) with respect to c and d is -1, which is to say that player 1 can increase his chances (by decreasing the probability that player 2 wins) by increasing c and d. When 3a>4b, the derivative wrt a is 3 and the derivative wrt b is -4, so in this case player 1 would do better to decrease a and increase b. When 3b>4a, the derivatives with respect to a and b are -4 and 3, respectively, and player 2 should increase a and decrease b. When 4a>3b and 4b>3a, the derivatives wrt ab and b are both 0, so changing a or b won’t help. Therefore, we would like to be in that middle region (where 9a<12b<16a), but within that region, we can do better and better by increasing c and d, and we don’t hurt ourselves by decreasing a and b (so long as that inequality holds). The best we can do is to increase c and d as much as possible, so that c=d=1 and a=b=0, and we have that P(W)=4/7 in this case.

One way I’ve tried to generalize this gradient analysis is to make a directed graph, where every vertex corresponds to a planar region of the objective function (bear with me, and I’ll explain why there aren’t an exponential number of vertices, and then why there are an exponential number of them). An edge will connect vertex U to vertex V iff you can move from a point in region U to a point in region V while decreasing the objective function. That is to say, U and V share a border, and U’s gradient points in that general direction. If two vertices each point to each other, it means that both their gradients slope into each other, and you can do best in that local area by having a strategy on the border between the two. We are now looking for the largest strongly connected component with no outbound edges in this graph: since it’s strongly connected, each region is “downhill” from the other regions in the component, and the best point is at the intersection of them all. and since this component has no outbound edges, we won’t find a better strategy elsewhere.

“But Alan,” I hear you ask, “didn’t you say there were an exponential number of regions?” Yes, I did. There are approximately 2^m regions, where m is the number of max()’s in P(W), since most of them can be satisfied regardless of what’s going on with the others (note that this is a very rough estimation, but I’m quite confident that there are an exponential number of them). However, any given max() will only involve a_{ch}’s related to a single h. That is to say, we can partition the a_{ch}’s into |H| different sets (1 for each bid involved), and every max() will only involve variables from one of these sets. Consequently, we can decouple all the different sets: the choices player 2 makes when player 1 bids “ace high” are independent of the choices involved when player 1 bids “king high,” which are independent of the “queen high” ones, etc. We can therefore make r separate graphs (r is the number of ranks in the deck), and find the strongly connected components in each one independently of the others.

This sounds great, but it’s not yet the end of our problems. Even if we just consider actions we can choose given a single bid from player 1, player 2 still has an exponential number of strategies. If player 1 bids “x high,” for every card i less than x that player 2 might be dealt, he has 2 possible actions. If player 2 was actually dealt card x, he has x possible actions. Consequently, we still have ~2^(x-1)*x sub-strategies for when player 1 bids “x high” (and there are r sets of sub-strategies to consider). This is certainly better than the \prod_{i=2}^r (2^{i-1}*i) possible strategies we had before, but it’s still exponential. Then again, we only need to consider up to rank 13, so this might be tractable already.

Reid has been writing a machine learning system to try to empirically solve this, and if he can get it right our troubles would be over (in a pragmatic sense, but our theoretical knowledge would still be lacking). However, neither of us is convinced yet that his system is learning correctly. We’ll see what happens.

Leave a Reply

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>